Two Easy Improvements to Lexical Weighting
Abstract
We introduce two simple improvements to the lexical weighting features of Koehn, Och, and Marcu (2003) for machine translation: one which smooths the probability of translating word f to word e by simplifying English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems.
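As a rough illustration of the first idea, the sketch below computes a lexical weight in the style of Koehn, Och, and Marcu (2003): for each word f in the foreign phrase, it averages t(f | e) over the English words aligned to f (or uses NULL when f is unaligned) and multiplies the results. It then recomputes the same quantity after collapsing English word forms. The helper names (`estimate_t`, `lexical_weight`, `crude_stem`), the suffix-stripping rule, and the toy data are assumptions made for this example and are not taken from the paper, which may simplify morphology and combine the features differently. The second idea, conditioning on the kind of training data, is not shown.

```python
from collections import defaultdict
from math import prod


def estimate_t(aligned_pairs, map_e=lambda e: e):
    """Relative-frequency estimate of t(f | e) from aligned (f, e) word pairs.

    map_e is applied to each English word before counting, so passing a
    stemmer pools counts over all surface forms that share a stem.
    """
    pair_counts = defaultdict(float)
    e_counts = defaultdict(float)
    for f, e in aligned_pairs:
        e = map_e(e)
        pair_counts[(f, e)] += 1.0
        e_counts[e] += 1.0
    return {(f, e): c / e_counts[e] for (f, e), c in pair_counts.items()}


def lexical_weight(f_words, e_words, alignment, t, map_e=lambda e: e):
    """Product over foreign words of the average t(f | e) over aligned e words.

    alignment is a set of (i, j) pairs linking e_words[i] to f_words[j].
    Unaligned foreign words are scored against the token "NULL".
    """
    factors = []
    for j, f in enumerate(f_words):
        linked = [map_e(e_words[i]) for (i, jj) in alignment if jj == j]
        if not linked:
            linked = ["NULL"]
        factors.append(sum(t.get((f, e), 0.0) for e in linked) / len(linked))
    return prod(factors)


def crude_stem(e):
    """Hypothetical stand-in for morphological simplification (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if e.endswith(suffix) and len(e) > len(suffix) + 2:
            return e[: -len(suffix)]
    return e


# Toy word-aligned data (invented for illustration): (foreign, English) pairs.
pairs = [("gato", "cat"), ("gatos", "cat"), ("gatos", "cats"), ("gatos", "cats")]

t_surface = estimate_t(pairs)                 # conditions on the surface form
t_stem = estimate_t(pairs, map_e=crude_stem)  # conditions on the stem

# The pair (gato, cats) never co-occurs, so the surface feature is zero,
# while the stem-level feature still assigns it some probability mass.
surface_score = lexical_weight(["gato"], ["cats"], {(0, 0)}, t_surface)
stem_score = lexical_weight(["gato"], ["cats"], {(0, 0)}, t_stem, map_e=crude_stem)
print(surface_score, stem_score)
```

In a real system the two scores would more likely be used as separate log-linear features than substituted for one another; that choice is left open here.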
Similar resources
The two be's of English
This qualitative study investigates the uses of be in Contemporary English. Based on this study, one easy claim and one more difficult claim are proposed. The easy claim is that the traditional distinction between be as a lexical verb and be as an auxiliary is faulty. In particular, 'copular-be', traditionally considered to be a lexical verb, is in fact a prototypi...
Semantic Feature Analysis Treatment for Anomia of Two Nonfluent Persian-Speaking Aphasic Patients
Objectives: Semantic Feature Analysis was designed to improve lexical retrieval of aphasic patients via activation of semantic networks of the words. In this approach, the anomic patients are cued with semantic information to assist oral naming. The purpose of this study was to examine the effects of Semantic Feature Analysis treatment on anomia of two nonfluent aphasic patients. Methods: A...
Combining lexical and statistical translation evidence for cross-language information retrieval
This paper explores how best to use lexical and statistical translation evidence together for Cross-Language Information Retrieval (CLIR). Lexical translation evidence is assembled from Wikipedia and from a large machine-readable dictionary, statistical translation evidence is drawn from parallel corpora, and evidence from co-occurrence in the document language provides a basis for limiting the ...
Workshop Notes of the ECML / MLnet Workshop on Empirical Learning of Natural Language Processing Tasks
This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and...
Topic Models for Dynamic Translation Model Adaptation
We propose an approach that biases machine translation systems toward relevant translations based on topic-specific contexts, where topics are induced in an unsupervised way using topic models; this can be thought of as inducing subcorpora for adaptation without any human annotation. We use these topic distributions to compute topic-dependent lexical weighting probabilities and directly incorpo...